Introduction to the USTerritoryMapping Package

Introduction

This vignette gives an overview of the USTerritoryMapping R package, which seeks to make creating categorical choropleth maps of the US that include the US territories a little bit easier!

First load the package.

library(USTerritoryMapping)

Prepare Data

To use this package, you will need to have a data frame with two columns:

  1. a categorical variable coded as a factor
  2. the two-letter US Postal Service code for each state and territory (e.g., VA for Virginia or VI for Virgin Islands). fipscodes.rda is provided to facilitate #2.

For this vignette, we’ll be using the two provided datasets census.uninsured19 and cdc.cvd. census.uninsured19 provides an example of a dataset with complete data for all 50 states, D.C., and the 5 US territories. It is already in the proper format for the provided package functions.

cdc.cvd is missing values for territories and requires additional processing which we will demonstrate below.

data(census.uninsured19)
data(cdc.cvd)

We can see that census.uninsured19 has these two components: 1. “Percent.Cat”: the Percentage Ages 19 or Under with No Health Insurance categorized as a factor 2. “STUSPS”: the two letter US Postal Service code

class(census.uninsured19$Percent.Cat)
#> [1] "factor"
table(census.uninsured19$Percent.Cat)
#> 
#>   Less than 5%     5% to <10% 10% or Greater 
#>             29             22              5
head(census.uninsured19$STUSPS)
#> [1] "PA" "CA" "WV" "UT" "NY" "DC"

In cdc.cvd we are missing the territories and our fill variable (Data_Value) has not yet been prepared as a factor.

We’ll first join the dataset to the provided fips_code dataset to get the full list of jurisdictions. Then we’ll code a new factor variable for mapping.

data("fips_codes")

cdc.cvd <- fips_codes %>%
              left_join(cdc.cvd, by = c("state" = "LocationAbbr")) %>%
              mutate(data.cat = factor(
                      case_when(
                            Data_Value < 198 ~ "Q1 (166 to < 198)",
                            Data_Value >= 198 & Data_Value < 215 ~ "Q2 (198 to < 215)",
                            Data_Value >= 215 & Data_Value < 248 ~ "Q3 (215 to < 248)",
                            Data_Value >= 248 & Data_Value < 400 ~ "Q4 (248 to 326)",
                            is.na(Data_Value) ~ "Data Not Available"
                      ),
                      levels = c("Q1 (166 to < 198)", "Q2 (198 to < 215)",
                                 "Q3 (215 to < 248)", "Q4 (248 to 326)", "Data Not Available")
                  )
               )

table(cdc.cvd$data.cat)
#> 
#>  Q1 (166 to < 198)  Q2 (198 to < 215)  Q3 (215 to < 248)    Q4 (248 to 326) 
#>                 13                 13                 12                 13 
#> Data Not Available 
#>                  6
class(cdc.cvd$data.cat)
#> [1] "factor"

Mapping US with Territory Geometries

Using Census Insurance Data

Start by defining the fill category colors with their factor labels.

colors.census <- c("Less than 5%" = "#feebe2", 
                    "5% to <10%" = "#f768a1", 
                    "10% or Greater" = "#7a0177")

Then specify any required parameters of the function (see documentation for details).

map1_categorical(data = census.uninsured19, 
                 join_var = "STUSPS", 
                 fill_var = "Percent.Cat", 
                 fill_color = colors.census, 
                 legend_name = "Percent Uninsured",
                 territory_label_color = "black",
                 title = "Figure 1. Percent Uninsured, Ages <19 Years",
                 save.filepath = "saved-maps/map1-uninsure.png")

Let’s say we wanted to add a border to highlight specific states or territories. We’ll first define a vector of US postal service IDs (in this example, Oregon, Wisconsin, Virginia, and USVI) and then feed this into the border_ids parameter.

border <- c("OR", "WI", "VA", "VI")

map1_categorical(data = census.uninsured19, 
                 join_var = "STUSPS", 
                 fill_var = "Percent.Cat", 
                 fill_color = colors.census, 
                 legend_name = "Percent Uninsured",
                 title = "Figure 1. Percent Uninsured, Ages <19 Years",
                 border_ids = border,
                 border_color = "red",
                 border_linewidth = 1,
                 save.filepath = "saved-maps/map1-uninsure2.png")

Using CDC Cardiovascular Disease Mortality Data

Sometimes we may want to remove the inset box outline, which we can do by specifying inset_box_color = "white".

We also highlight an additional option of removing the territory labels by specifying territory_label_color = "white".

colors.cdc <- c("Q1 (166 to < 198)" = "#ffffcc",
                 "Q2 (198 to < 215)" = "#a1dab4",
                 "Q3 (215 to < 248)" = "#41b6c4",
                 "Q4 (248 to 326)" = "#225ea8",
                 "Data Not Available" = "grey80")

map1_categorical(data = cdc.cvd, 
                 join_var = "state",
                 fill_var = "data.cat", 
                 fill_color = colors.cdc, 
                 fill_linewidth = 1.2,
                 fill_linecolor = "black",
                 inset_box_color = "white",
                 territory_label_color = "white",
                 legend_name = "CVD Mortality Rate\nper 100,000 persons",
                 border_ids = border,
                 border_color = "red",
                 border_linewidth = 1.5,
                 save.filepath = "saved-maps/map1-cvd.png") 

Mapping US with Territory Labels

We love maps of the territory geometries, but you might also want a map with the territory labels.

colors.census <- c("Less than 5%" = "#feebe2", 
                    "5% to <10%" = "#f768a1", 
                    "10% or Greater" = "#7a0177")

border <- c("OR", "WI", "VA", "VI")

map2_categorical(data = census.uninsured19, 
                 join_var = "STUSPS", 
                 fill_var = "Percent.Cat", 
                 fill_color = colors.census, 
                 legend_name = "Percent Uninsured",
                 title = "Figure 1. Percent Uninsured, Ages <19 Years",
                 border_ids = border,
                 border_color = "red",
                 border_linewidth = 1,
                 save.filepath = "saved-maps/map2-uninsure.png")

Note that in the current package version, territory labels cannot be highlighted with a border, even when specified in the border ID vector.

colors.cdc <- c("Q1 (166 to < 198)" = "#ffffcc",
                 "Q2 (198 to < 215)" = "#a1dab4",
                 "Q3 (215 to < 248)" = "#41b6c4",
                 "Q4 (248 to 326)" = "#225ea8",
                 "Data Not Available" = "grey80")

map2_categorical(data = cdc.cvd, 
                 join_var = "state",
                 fill_var = "data.cat", 
                 fill_color = colors.cdc, 
                 fill_linewidth = 1.2,
                 fill_linecolor = "black",
                 inset_box_color = "white",
                 legend_name = "CVD Mortality Rate\nper 100,000 persons",
                 border_ids = border,
                 border_color = "red",
                 border_linewidth = 1.5,
                 save.filepath = "saved-maps/map2-cvd.png")